Branch Write Back SRU FPU 3 FPU 2 FPU 1 FX

Author

  • Nathalie Drach-Temam
Abstract

This paper presents the performance of DSP, image and 3D applications on recent general-purpose microprocessors using streaming SIMD ISA extensions (integer and floating point). The 9 benchmarks we use for this evaluation have been optimized for DLP and cache use with SIMD extensions and data prefetch. The result of these cumulated optimizations is a speedup that ranges from 1.9 to 7.1. All the benchmarks were originally computation bound, and 7 become memory bandwidth bound with the addition of SIMD and data prefetch. Quadrupling the memory bandwidth has no effect on the original kernels but improves the performance of the SIMD kernels by 15-55%.

Similar resources

Recycle of Immobilized Endocellulases in Different Conditions for Cellulose Hydrolysis

The immobilization of cellulases could be an economical alternative for cost reduction of enzyme application. The derivatives obtained in the immobilization derivatives were evaluated in recycles of paper filter hydrolysis. The immobilization process showed that the enzyme recycles were influenced by the shape (drop or sheet) and type of the mixture. The enzyme was recycled 28 times for sheets ...

Microsoft Word - FPU.docx

Floating-point arithmetic units (FPU) have paramount importance in applications that involve intensive mathematic operations. However, previous implementations of FPU either require much manual work or only support special functions (e.g. reciprocal, square root, logarithm, etc.). In this paper, we present an automatic method to synthesize general FPU by aligned partition. Based on the novel pa...

Method for Ultra-precision FPU Integration based on Fine-Grained Control

In general, the FPU and processor are decoupled in the method for FPU integration, in which the communication between them requires software intervention and ultra-precision FPU is unsupported. To avoid this problem, a method based on fine-grained control for integration of FPU into the RISC processor is proposed in this paper. In terms of operand width of floating-point instructions, the metho...

IBM PowerPC 440 FPU with complex-arithmetic extensions

The PowerPC 440 floating-point unit (FPU) with complex-arithmetic extensions is an embedded application-specific integrated circuit (ASIC) core designed to be used with the IBM PowerPC 440 processor core on the Blue Gene/L compute chip. The FPU core implements the floating-point instruction set from the PowerPC Architecture and the floating-point instruction extensions created to aid in matri...

Resonant normal form for even periodic FPU chains

We investigate periodic FPU chains with an even number of particles. We show that near the equilibrium point, any such chain admits a resonant Birkhoff normal form of order four which is completely integrable—an important fact which helps explain the numerical experiments of Fermi, Pasta, and Ulam. We analyze the moment map of the integrable approximation of an even FPU chain. Unlike the case o...

Publication date: 2001